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MULTIPLE-RELAXATION-TIME LATTICE BOLTZMANN MODELS IN 3D 


DOMINIQUE D’HUMIERES* , IRINA GINZBURG 1 , MANFRED KRAFCZYK*, PIERRE LALLEMAND 1 , AND 

LI-SHI LUO' 

Abstract. This article provides a concise exposition of the multiple- relaxation-time lattice Boltzmann 
equation, with examples of fifteen-velocity and nineteen-velocity models in three dimensions. Simulation of a 
diagonally lid-driven cavity flow in three dimensions at Re = 500 and 2000 is performed. The results clearly 
demonstrate the superior numerical stability of the multiple-relaxation-time lattice Boltzmann equation over 
the popular lattice Bhatnagar-Gross-Krook equation. 

Key words, multiple-relaxation-time LBE in 3D, D3Q15 and D3Q19 models, 3D diagonal lid driven 
cavity flow 

Subject classification. Fluid Mechanics 

1. Introduction. The relaxation lattice Boltzmann equation (RLBE) was introduced by Higuera and 
Jimenez [20] to overcome some drawbacks of lattice gas automata (LG A) such as large statistical noise, limited 
range of physical parameters, non-Galilean invariance, and implementation difficulty in three dimensions. In 
the original RLBE the equilibria and the relaxation matrix were derived from the underlying LGA models. It 
was soon realized that the connection to the LGA model could be abandoned and the equilibria and collision 
matrices could be constructed independently to better suit numerics [21]. 

The simplest lattice Boltzmann equation (LBE) is the lattice Bhatnagar-Gross-Krook (BGK) equation, 
based on a single-relaxation-time approximation [1], Due to its extreme simplicity, the lattice BGK (LBGK) 
equation [29, 4] has become the most popular lattice Boltzmann model in spite of its well known deficiencies. 

The multiple-relaxation-time (MRT) lattice Boltzmann equation was also developed at the same time [7]. 
The MRT lattice Boltzmann equation (also referred to as the generalized lattice Boltzmann equation (GLBE) 
or the moment method) overcomes some obvious defects of the LBGK model, such as fixed Prandtl number 
(Pr = 1 for the BGK model) and fixed ratio between the kinematic and bulk viscosities. The MRT lattice 
Boltzmann equation has been persistently pursued, and much progress has been made. Successes include: 
formulation of optimal boundary conditions [10, 22], interface conditions in multi-phase flows [11] and free 
surfaces [12], thermal [26] and viscoelastic models [13, 14], models with reduced lattice symmetries [8, 2], 
and improvement of numerical stability [23]. It should be stressed that most of the above results cannot be 
obtained with the LBGK models. Applying optimization techniques in coding, the computational efficiency 
of the RLBE method [7] can be fairly close to that of the LBGK method for most practical applications 
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(RLBE schemes could be ca. 15% slower than their LBGK counterparts in terms of the number of sites 
updated per second). Recently it was shown that the multiple-relaxation-time LBE models are much more 
stable than their LBGK counterparts (Lallemand & Luo 2000), because the different relaxation times can 
be individually tuned to achieve ‘optimal’ stability. 

In this paper we intend to bring attention to the multiple-relaxation-time LBE by demonstrating its 
superior stability to LBGK models. This paper is organized as follows. Section 2 briefly outlines the basic 
theory of the multiple-relaxation-time LBE. Section 3 provides as a ‘template’ example: a fifteen-velocity 
RLBE model in three dimensions (D3Q15 model). Section 4 gives some results for a three-dimensional 
cavity flow simulated by using both RLBE and LBGK methods. Finally, section 5 concludes the paper. The 
appendix briefly describes the nineteen- velocity model in three dimensions (D3Q19 model). 

2. Multiple-relaxation-time lattice Boltzmann equation. Although it can be shown that the 
lattice Boltzmann equation is a finite difference form of the linearized continuous Boltzmann equation [17, 18], 
we present RLBE as a self-contained mathematical object representing a dynamical system with a finite 
number of moments in discrete space and time. 

The general RLBE model has three components. The first component is discrete phase space defined 
by a regular lattice in D dimensions together with a set of judiciously chosen discrete velocities {e a \a = 
0, 1, . . . , N} connecting each lattice site to some of its neighbours. The fundamental object in the theory is 
the set of velocity distribution functions {f a |ct = 0, 1, . . . , A r } defined on each node r* of the lattice. The 
second component includes a collision matrix S and (N + 1) equilibrium distribution functions {/« |ai = 
0, 1, . . . , N}. The equilibrium distribution functions are functions of the local conserved quantities. The 
third component is the evolution equation in discrete time t n = nSt, n = 0, 1, . . . , 

\f(ri + e a 5t, t + St)) - | /(r;, t)) = -S [ |/(r ; , t)) - |/ (eq) (r i; t)) ] . (2.1) 

In the above equation we have used the following notations for column vectors in (N + l)-dimensional space 
¥ = R n+1 , 

I f(ri, t)) = (foiru t), /i (r it t), ..., f N (ri, tj) T , 

|/ teq) (r i; t)) = (/o eq) (n, t), /| eq) (ri, t), ..., /^ q) (rq t)) T , and 
| f(ri + e a St, t + St)) = (fo(ri, t + St), fN(ri + ewSt, t + St)) T , 

where T denotes the transpose operator and we always assume that eo = 0. From here on the Dirac 
notations of bra (j and ket [•) vectors are used to denote respectively the row and column vectors. Note 
that equation (2.1) is written in such a way that its right- and left-hand-sides represent the two elementary 
steps in the evolution of the lattice Boltzmann equation: advection and collision. The advection process is 
naturally executed in velocity space V, f a (r j , t) being simply shifted in space according to the velocity e a to 
fa(ri + e a St, t + St). The collision process is naturally accomplished in the space spanned by the eigenvectors 
of the collision matrix, the corresponding eigenvalues being the inverse of their relaxation time towards their 
equilibria. The ( N + 1) eigenvalues of S are all between 0 and 2 so as to maintain linear stability and the 
separation of scales, which means that the relaxation times of non-conserved quantities are much faster than 
the hydrodynamic time scales. The LBGK models are special cases in w r hich the ( N + 1) relaxation times 
are all equal, and the collision matrix S = u>l, where I is the identity matrix, io = 1/r, and t (> 1/2) is the 
single relaxation time of the model. 

To simulate athermal fluids, a necessary criterion is that the discrete velocity set must be sufficient to 
represent a scalar (mass density p), a vector (momentum j), another scalar (pressure P), and a symmetric 
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traceless second rank tensor (viscous stress tensor cry). More generally, the velocity set must possess sufficient 
symmetries for the hydrodynamic equations to hold: the conserved quantities and their fluxes must transform 
properly so that they can approximate their continuous counterparts in an appropriate limit. Finally the 
local conserved quantities must be the mass density p and the momentum j for athermal fluids. 

Given a chosen set of discrete velocities {e a |o: = 0, 1. . . . , N} and corresponding distribution functions 
{f a \a = 0, 1, . . . , A r ), an equal number of moments {nip \0 = 0, 1, . . . , N} of the distribution functions f a 
can be obtained as 


mp = (<f>d\f) = </ |<Ae)> (/I = (fo, fi, ■■■, In), 


( 2 . 2 ) 


where {\0p)\0 = 0, 1, . . . , N} is an orthogonal dual basis set constructed by the Gram-Schmidt orthogonal- 
ization procedure ( e.g . Bouzidi et al. 2001a) from polynomials of the column vectors \e Xi ) in space V. Vector 
\e Xi ) is built from the components of the e a ’s, i.e. \e Xi ) = (eo Xi , ei x; , . . . , ejv x ,) T , for i € {1, 2, ... , D} in 
D dimensions (e.g., {|e x ), |e y ), \e z )} in three dimensions). 

The set {\<t>p)} is analogous to the Hermite tensor polynomials in continuous velocity space [15, 16]. 
It should be stressed that the orthogonal functions defined on a finite set of discrete velocities {e a } has 
some degeneracies which do not exist in the Hermite tensor polynomials in continuous space. Obviously, 
the moments are simply linear combinations of the / n ,’s, therefore velocity space V, spanned by |/) = 
(/„, fi, ■■■, M T , and the moment space M, spanned by | m) = (mo, mi, ..., mjv) T , are related by a 
linear mapping M: |m) = M|/) and |/) = M — 1 1 ///,) . The transformation matrix M would be an orthogonal 
transformation if the basis vectors {{(fip)} are normalized. 

If the matrix S is chosen such that the {{(bp)} are its eigenvectors, the linear relaxation of the kinetic 
modes in moment space M naturally accomplishes the collision process. Then the evolution equation (2.1) 
of the multiple-relaxation-time lattice Boltzmann equation becomes [7, 23]: 

I f(ri + e a 6t, t + St)} - \f(n, t)) = — M -1 S [ |m(r £ , t)) - |m (eq )( ri , t)) ] , (2.3) 

where the collision matrix S = M - S- M _1 is diagonal: S = diag(so, si, • • • , sn ), and mL eq) is the equilibrium 
value of the moment m a - The (N + 1) moments can be separated into two groups: the ‘hydrodynamic’ 
(conserved) moments and the ‘kinetic’ (non-conserved) moments. The first group consists of the moments 
locally conserved in the collision process, so that in general mj ? eq) = mp. The second group consists of the 
moments not conserved in the collision process so that mjj / m,}. The equilibria {m^ eqJ } are functions 
of the conserved moments and are invariant under the symmetry group of the underlying lattice. For 
models designed to simulate athermal fluids, the only hydrodynamic moments are mass density p (a scalar) 
and momentum j (a vector): energy is not a conserved quantity for athermal fluids. Equilibrium values of 
kinetic moments are functions of p and ||j[| 2 for scalars, and j times some functions of p and ||j|| 2 (eventually 
a constant) for vectors, and so on, as discussed in §3. 

From the above definition of the conserved and non-conserved moments, Eq. (2.3) can be rewritten as 


/ (r i + e a St, t + St)) - | f(r i} t)) 


-E 


j^^[(mp(ri, t) - m^ q) (ri, t))] \<j> 0 ), 


(2.4) 


where we have used the fact that (M ■ M T ) is a diagonal matrix with diagonal elements (4>p\4>s). It is obvious 
that the actual values of the sp’s for conserved moments are irrelevant, because m^ eq ^(ri, t) = mp(ri, t) for 
0 G bv definition, but they are set to zero in general in what follows. Note that this point is purely 
academic in the present context, but is very important when including body forces, as shown by Ginzbourg 
and Adler [10]. 
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The RLBE formulation has two important consequences. First, one has the maximum number of ad- 
justable relaxation times, one for each class of kinetic modes invariant under the symmetry group of the 
underlying lattice. Second, one has maximum freedom in the construction of the equilibrium functions of 
the non-conserved moments. One immediate result of using the RLBE instead of the LBGK model is a 
significant improvement in numerical stability [23]. It should be emphasized that the above procedures are 
general and are independent of the number of discrete velocities and the number of space dimensions. 

3. Multiple-relaxation-time D3Q15 model. Each point on a unit cubic lattice space has six nearest 
neighbours, (±1, 0, 0), (0, ± 1 , 0), and (0, 0, ±1), twelve next nearest neighbours, (±1, ±1, 0), (±1, 0, ±1), 
and (0, ±1, ±1), and eight third nearest neighbours, (±1, ±1, ±1). Elementary discrete velocity sets for 
lattice Boltzmann models in three dimensions are constructed from the set of twenty-six vectors pointing 
from the origin to the above neighbours and the zero vector (0,0,0). The twenty-seven velocities are usually 
grouped into four subsets labelled by their squared modulus, 0, 1,2, and 3. We also use the notation DdQq 
for the g-velocity model in d-dimensional space in what follows [29]. The D3Q15 model uses the velocity 
subsets 0, 1, and 3 and is described here as an example. The D3Q19 model uses the subsets 0, 1, and 2 and 
is described in the appendix. The D3Q13 model introduced by d’Humieres et al. [8] only uses the subsets 0 
and 2. 

The fifteen discrete velocities in the D3Q15 model are 

[ (0, 0, 0), a = 0, 

e Q = < (±1, 0, 0), (0, ±1, 0), (0, 0, ±1), a = 1, 2, . . . , 6, (3.1) 

( (±1, ±1, ±1), a = 7, 8, ..., 14. 

The components of the corresponding fifteen orthogonal basis vectors \4>g) a are given by: 


0 o)a = |M|°, 

0i}a = ||e Q || 2 - 2, 

02>a = |(15||e Q || 4 -55||e a || 2 + 32), 


03) a = j 

05 )a = ^ocyi r 

07 )a = £<xzi J 

04 ) a = |(5||e a || 2 - 13)e ax , 1 
06>« = |(5||e a || 2 - 13)e a y, > 
08 > a = 5(5||e a || 2 - 13)e az , J 
09>a = 3e 2 x - [|e a || 2 , | 

010) a ^ ay ~ ^azi J 

011) a — ^ax^ay: 

012) a — Gay^azi * 

013) a = ('ax^az', , 

014) a — t 'ax^-ay^azi 


(3.2a) 


(3.2b) 


(3.2c) 


(3. 2d) 


(3.2e) 


(3.2f) 


where a G {0, 1, ... ,14}, ||e a || = (e 2 x + e 2 ^ + e 2 .) 1 ^ 2 and ||eo||° = 1. The orthogonal basis set {] 0^)} 
is obtained by orthogonalizing the polynomials of the column vectors \e Xi ) by the standard Gram-Schmidt 
procedure ( e.g . [2])). The corresponding fifteen moments {m^l/3 = 0, 1, ..., 14} are: the mass density 
(mo = p), the part of the kinetic energy independent of the density (mi = e), the part of the kinetic energy 
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square independent of the density and kinetic energy ( m 2 = e = e 2 ), the momentum ( 7713 , 5,7 = jx,y,z)i the 
energy flux independent of the mass flux (m. 4 , 6.8 = Qx,y, z ), the symmetric traceless viscous stress tensor 
(m 9 = 3p xx , mio = p ww = p vv - p zz , with p xx + p yy + p z - = 0, mn,i 2 ,i 3 = Pxy,yz,zx ), and an antisymmetric 
third-order moment (mi 4 = m xyz ), corresponding to the following order: 

|m> = (/», e, e, j y , q v , j z ,q z , 3p xx ? PwWi Pxyi Pyzj Pzx m , X yz ) • 

The collision matrix S in moment space M is the diagonal matrix 

S = diag(0, s 1 , s 2 , 0, s 4 , 0, a 4 , 0, s 4 , s 9 , s 9 , s n , sn, Sn, s x4 ), (3.3) 

zeros corresponding to conserved moments in the order chosen. And the matrix M is then given by 

111111111111111 
-2 -1 -1 -1 -1 -1 -1 1 1 1 1 1 1 1 1 

16 -4 -4 -4 -4 -4 -4 1 1 1 1 1 1 1 1 

0 1 -1 0 0 0 0 1 -1 1 -1 1 -1 1 -1 

0 -4 4 0 0 0 0 1 -1 1 -1 1 -1 1 -1 

0 0 0 1 -1 0 0 1 1 -1 -1 1 1 -1 -1 

0 0 0-4 4 0 0 1 1-1-1 1 1-1-1 

M = 0 0 0 0 0 1 -1 1 1 1 1 -1 -1 -1 -1 . (3.4) 

0 0 0 0 0-4 4 1 1 1 1 -1 -1 -1 -1 

022 -1 -1 -1 -1 00000000 

000 1 1 -1 -1 00000000 

0 0 0 0 0 0 0 1 -1 -1 1 1 -1 -1 1 

000000011-1-1-1-111 
00000001-1 1-1-1 1 -1 1 
0 0 0 0 0 0 0 1 -1 -1 1 -1 1 1 -1 

Note that the row vectors of M, {(tf> 0 1 }, are orthogonal to each other but they are not normalized, i.e. 

= ||0a|| • Ik'.'il • 5 a 0 . Note also that, with different ordering and normalization, the basis vectors 
{ |0fe)} given by Ginzburg [9] are the same as the ones given here, except \cpi) and |^ 2 ) which are replaced by 
an orthogonal linear combination. This would make a difference only when .sq ^ a 2 . The 4th, 6th, and 8th 
row vectors of M (corresponding to j x , j y , and j z , respectively) uniquely define the ordering (or labelling) 
of the velocity set { e a } in subscript a. 

With c 2 = 1/3 (c« is the sound speed) and s 9 = sn, the equilibria of the kinetic moments as functions 


of pl eq ' = p and j ( eq ) = j up to second-order are given by 

e (eq) = ~/>+ —j j = ~P+ —(jx + j 2 v + jl ), (3.5a) 

Po Po 

e (eq) = - p , (3.5b) 

li eq) = -\jx, 4 eq> = ~\hn «i eci) = - \iz, (3.5c) 

Pxx ] = ^ 7 - [2 jl - (jy + j 2 )] > Pww = -7 [jy ~ jl] , ( 3 -5d) 

dp 0 pO 

Pxy ] = 7 - jxjy , P^ q) = ^~jyjz, Px? = \ ~jxjz , (3.5e) 

Po Po Po 

m^ q > = 0. (3.5f) 


O 



The constants are defined as follows. The constant po is the mean density in the system and is usually 
set to be unity in simulations. The approximation of 1/p ss 1 / p 0 is used in equations (3.5a), (3.5d), and 
(3.5e) to reduce compressibility effects in the model (He & Luo 1997c). If the usual compressible Navier- 
Stokes equations are required, one only has to replace po by p. Equation (3.5b) has the general form 
£ (eq) = w ( p + w e j j ■ j / po , where w e and w e j are free parameters which do not have much effect on the 
asymptotic Navier-Stokes equation simulated by the model. In this model, we set w e = — 1 and w ej = 0; to 
recover the LBGK model, one must set w e = 1 and w e j = —5. 

The above equilibrium functions are obtained by optimizing the isotropy and Galilean invariance of the 
model. The details are described in [23]. The kinematic viscosity v and the bulk viscosity £ of the model are 


Hu 


i 

Sll 


c = 


(5 - 9c g) 


(1 


2 

(1 


U'i 

2 ) 

~ 9 

Ui 

2 ) 


(3.6a) 

(3.6b) 


We emphasize that the above formulae are obtained under the conditions that S 9 = sn and of equa- 
tion (3.5c), which are the results of the optimization, and the mean fluid velocity V = 0 . Corrections to 
these transport coefficients for finite k and non-zero mean velocity V can be calculated from the solution of 
the linearized dispersion equation of the system, which is equivalent to the standard von Neumann stability 
analysis [23]. 

Some properties of the lattice Boltzmann equation are dictated by the symmetries of the discrete velocity 
set and the simplicity of the dynamics on the lattice. One consequence is the existence of spurious invariants 
that may lead to some undesirable artifacts in simulations, especially near boundaries. One such invariant is 
the staggered invariant in LGA and LBE models [30]. The D3Q15 model has also another special invariant 
not found in most LBE models: the parity x( r i) of a vector n = (aq, y,, z{) defined on a three-dimensional 
cubic lattice by 


X(ri) = (ar» + yi + Zi) (mod 2), for n € Z 3 . (3.7) 

For the D3Q15 model, if x( r i) is 0 (r* € Z 3 ), then \(n + e a ) is 1 (r, € Z 3 ) for e a ^ 0, and vice versa. 
This means that the system has two decoupled sub-lattices (Z 3 and Z 3 ) for momentum, and these two sub- 
lattices can be coupled through boundary conditions. Consequently the system has a chequerboard (parity) 
invariance and one should be aware of this fact when using the D3Q15 models, especially when short-wave- 
length oscillations are observed in simulations. The oscillations due to the checkerboard invariance often 
causes numerical instability in simulations. In contrast, the D3Q19 model with velocities of parities 0 and 1 
does not have this checkerboard invariance. 

4. Simulations. In order to demonstrate the enhanced stability of the RLBE approach, we simulated 
a diagonally lid-driven cavity flow [28] with a flow configuration shown in figure 1. The mesh is uniform and 
of size 52 3 . The boundary condition (BC) at the top plane (at y = 1) is Unc = — (\/2, 0, \/2)/20, so that 
Ubc = 1 1 L 7 bc \ I = 0.1 in lattice units. The other five planes were subject to no-slip boundary conditions. 

The relaxation parameters used in the RLBE simulations are po = 1, Si = 1.6, s -2 = 1.2, .S 4 = 1.6, and 
S 14 = 1.2. The values of the relaxation parameters (s a ) and the adjustable parameters in e^ eq ^ (w ( and w e j) 
have been obtained to attain optimal numerical stability but can only be regarded as ‘sub-optimal’ values 
which are the result of a compromise between the expected range for the Reynolds number and the effort 
required to find the optimal values by searching a large parameter space through linear analysis. These 
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Fig. 1. Diagonally driven cavity flow 


parameters are not adjusted to the actual Reynolds number in each simulation, but are kept constant once 
chosen. The relaxation parameters sg = sn are determined by the viscosity from equation (3.6a). The 
accuracy of the simulation is also enhanced by using, instead of the variable p, its fluctuations Sp = p — po- 
The boundary conditions on the top plane are obtained in velocity space by assigning {/„} to [22] 


fa 


W a P0- 


e a ■ Ubc 


(4.1) 


where w a = 1/9 for a € {1, . . . , 6} and w a = 1/72 for a e {7,..., 14}. It should be stressed that this 
particular implementation of a sliding boundary imposes a constant pressure po = c 3 po at the boundary, 
which is incorrect; and the momentum transfer in the direction perpendicular to the moving lid is significantly 
weakened. The correct boundary conditions consistent with the bounce-back boundary conditions should be 
[24, 25, 3]: 


t e , 0 ... . e, 5 ■ Ubc t 0 e a -U D c 

fa — Ja “I" 2w a Po 2 — /c* 2,W a pO 2 > 


(4.2) 


where f & is the distribution function of e a = -e a . Nevertheless, the implementation prescribed by equation 
(4.1) does help to enhance the stability of the D3Q15 model. The ‘node’ bounce-back boundary conditions 
are applied to the rest five walls for no-slip boundary conditions [6]. The ‘node’ bounce-back boundary 
conditions differ from the ‘link’ bounce-back boundary conditions by a one-step delay in time but otherwise 
they are the same. This one-time-step delay seems to effectively reduce oscillations caused by the parity 
invariance and thus enhances the numerical stability [5]. 

As the effective width of the system is approximately 50 lattice units, the Reynolds number Re = 
50 Ubc I 11 was set by varying the viscosity v. We computed the lower bounds of the viscosity for this 
particular flow by using the RLBE and LBGK schemes. The lower bounds are 0.6 • 10~ 3 for RLBE scheme 
and 2.5 -10 -3 for LBGK scheme with the identical discretization, initial and boundary conditions. Viscosities 
smaller than these bounds would lead to numerical instability in the simulation. Hence for our test problem 
with the same mesh size, the maximum Reynolds number attainable by using the RLBE scheme is about 
four-times that attainable using the LBGK scheme. 

For the Povitsky cavity flow [28] at a low Reynolds number Re = 500 (viscosity v = 0.01), the pressure 
field computed by the LBGK scheme shows severe oscillations throughout the entire computational domain, 
even in locations far away from the corner singularities, in contrast to the much smoother pressure field 
obtained by using the RLBE scheme, as depicted in figure 2. 

When the Reynolds number is increased to 2000, the solution obtained by using the RLBE scheme 
agrees reasonably well with that obtained by using the commercial software FLUENT with a non-uniform 
68 3 mesh [28], as shown in figure 3, even though the RLBE grid resolution is much coarser. At a relatively 
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Fig. 2. Cavity flow Re = 500 pressure contours at z — 0.5: l) RLBE, r) LBGK 


high grid Reynolds number Re* = U/v = 40, the pressure field still bears useful information, at least at 
some distance from the top corner singularities, as shown in figure 4. In contrast, the LBGK simulation 
at Re = 2000 did not converge due to severe oscillations. With further increase of the Reynolds number 
to 4000 (u = 0.00125), the flow field becomes unsteady and complex three dimensional vortex shedding are 
observed. Detailed analysis of the flow will be published elsewhere. 



Fig. 3. Cavity flow Re = 2000 stream lines at y = 0.5: l) ELBE 52 3 uniform grids, r) FLUENT 101 3 non-uniform grids. 


In the simulations, suitable coding techniques should be applied to optimize the computational efficiency 
of the code. First and foremost, one should not use matrix calculations in the transformations between space 
V and space M, instead, the transformations should be carried out explicitly using the formulae mapping |/} 
to | m) and vice versa (equations (2.2) and (2.4)). Secondly, the equilibria must be computed in moment space 
M and not in velocity space V: this is the reason why we do not provide the equilibrium distribution functions 
fa q ^. Thirdly, all the common sub-expressions should be computed only once. This can be achieved either 
by explicitly computing these sub-expressions as separate variables or by carefully putting them between 
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parentheses and trusting modern compilers to do themselves the sub-expression reduction. Various compiler 
optimization options can easily accomplish this. Finally the use of po = 1 instead of p avoids the need of a 
division in the calculations of the equilibria, and the use of Sp instead of p to increase numerical accuracy. 



1 11 21 30 40 50 


Fig. 4. Cavity flow Re = 2000 pressure contours at z = 0.5, RLBE 52 3 uniform grids 

By following these basic practices based on common sense, the number of operations required for the sim- 
ulation of the multiple-relaxation-time D3Q15 model can be reduced to less than 120 additions/subtractions 
and 40 multiplications per time step on a grid point, as opposed to 80 additions/subtractions and 40 mul- 
tiplications for the LBGK scheme with the same optimization effort. (The purpose of this counting is not 
to find the exact lower bounds, but only to have an estimate.) We would also like to stress that on modern 
computers the computational performance of lattice Boltzmann schemes is mostly limited by the available 
memory bandwidth and that rather soon the cost of local floating-point operations will be negligible. For 
instance, combining the collision and propagation steps into one loop would reduce about 1/3 of the time, 
because use of two loops doubles the memory access time. (However, this combination of loops is difficult to 
implement on vector machines.) With the optimization except the combination of collision and propagation 
together, the number of sites updated per second of the RLBE D3Q15 scheme for our test carried out on one 
node (8 processors) of a Hitachi SR-8000 parallel vector machine is about 1.76 ■ 10 7 as opposed to 2.06 • 10 7 
for the LBGK D3Q15 scheme: the RLBE scheme is about 17% slower than the LBGK counterpart. The 
achieved FLOPS rate is 3.18 GFLOPS for the RLBE scheme versus 2.70 GFLOPS for the LBGK scheme. 
However it is important to note that, with the same computational effort and near the limit of numerical 
stability, the results obtained by using the RLBE scheme is much more accurate than the results obtained 
by using the LBGK scheme which are contaminated by numerical instability. 

Free of the parity invariance, the D3Q19 RLBE model (see Appendix) further improves the stability. 
We have tested the D3Q19 RLBE model in the Povitsky cavity flow [28]. We used the ‘link’ bounce-back 
boundary conditions for the five walls and the correct boundary condition of equation (4.2) for the moving 
lid. With the same resolution of 51 3 , the results obtained by using the D3Q19 RLBE model are much more 
accurate than that obtained by using the D3Q15 RLBE model with different boundary conditions. This 
confirms the previous observation that the D3Q19 LBGK model is more stable than the D3Q15 LBGK 
model [27]. A further comparative study of these two RLBE models is left for future work. 

5. Conclusions. In this paper we provide a synopsis of the multiple-relaxation-time LBE in three 
dimensions and demonstrate its superior numerical stability and efficiency through the simulation of the 
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diagonally lid-driven cavity flow in three dimensions. The flow is geometrically simple, steady, and yet 
non-trivial. For this flow we estimate that the improvement in stability brought by the RLBE scheme 
yields an about four-fold gain in maximum Reynolds number when compared to the LBGIv scheme. Of 
course, this improvement would be flow and boundary and initial condition dependent. Given that the 
computational effort required to solve time-dependent flows in three dimensions is basically proportional to 
Re 3 , the stability improvement by using the RLBE scheme would reduce the computational effort by at least 
one order of magnitude while maintaining the accuracy of the simulations. 
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Appendix A. Multiple-relaxation-time D3Q19 model. 

The nineteen discrete velocities in D3Q19 models are: 


( 


Ca — S 


( 0 , 0 , 0 ), 

(± 1 , 0 , 0 ), ( 0 , ± 1 , 0 ), ( 0 , 0 , ± 1 ), 

(± 1 , ± 1 , 0 ), (± 1 , 0 , ± 1 ), ( 0 , ± 1 , ± 1 ), 


a = 0, 

a = 1, 2, . . . , 6, 
a = 7, 8, . . . , 18, 


(A.l) 


and the components of the nineteen orthogonal basis vectors are given by 


0 o)a = |l e a|l° = 1, 

0i>« = 19||e a || 2 — 30, 

02>a = (21||e a || 4 — 53||e a || 2 + 24)/2, 

03 )a = ^ axi 

05 )a = £q.(/j 
07) a = €az: 


0r)a — (5||e a ||“ 

06)a = (5||e a ||“ — 9)e a y, * 

08>a = (5||e Q II' 2 - 9)e az , , 

09 > a = 3e 2 x - ||e a || 2 , j 
01l)a £ ay J 

013) a = ^ax^-ay: 

014) a = €ay€-azi 

015) a — axeaz •, 

0io>„ = (3||e a || 2 - 5) (3e 2 s - IK|| 2 ), 1 
0i 2 >„ = (3||e a || 2 - 5) (e 2 , - e 2 J, J 

016) a = ( e cty ~ e az) e ax-, 

017 ) a = ( e az ~ e ax) e ay, 

01»)a = { e ax ~ e ay) e az-. 


(A. 2a) 


(A. 2b) 


(A. 2c) 


(A. 2d) 


(A.2e) 


(A.2f) 


(A.2g) 


where a S {0, 1, ... , 18}. The corresponding nineteen moments {mp\/3 = 0, 1, . . . , 18} are arranged in the 
following order: 


I m) = (p,e,e,j x ,q x ,j y ,q y ,j z ,q z ,3p xx t ^ftxx 7 Pww 7 ft ww i Pxy ■> Pyz 7 Pxz 7 m x 7 ^ly 7 m z ) 
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There are fourteen vectors in the orthogonal basis set {| (bp)} with the same physical significance of the basis 
vectors in the D3Q15 model except for These fourteen vectors correspond to the following moments: 

p , e, e, j, q , and pij. In the D3Q19 basis set {|</>p)} there is no vector corresponding to the moment m xvz of 
equation (3.2f). Instead, there are five vectors which are not in the D3Q15 basis set: three vectors of cubic 
order (|<6ie) ? |<?>i7), and |0is)) and two of quartic order (|</>i 0 ) and | ^>1 2 ) ) - These five vectors are polynomials 
in |03) , 1 05 ) , and \(j> 7). The two vectors of quartic order, 1 0io) and <^12), have the same symmetry as the 
diagonal part of the traceless tensor Py , while other three vectors of cubic order are parts of a third rank 
tensor, with the symmetry of jkPnm- 
The diagonal collision matrix S is 

S = diag(0, Si, S2, 0, S4, 0, S4, 0, S4, Sg, S10, S9, S10, S13, S13, S13, Si6, Si6, Sie), 

and the transformation matrix M is given at the end of this appendix. Again, the 4t,h, 6th, and 8th row 
vectors of M (corresponding to j x , j y , and j z , respectively) uniquely define the ordering (or labelling) of the 
velocity set {e a } with respect to subscript a. 

With c 2 s = 1/3 and Sg = S13, the equilibria of the non-conserved moments are given as functions up to 
second-order in p and j as follows: 


e (eq) = 

-np + ^ 

Po 

+ 

i-H 

1 

II 

-(i 2 

„ \Jx 

Po 


+jf), 

(A. 3a) 

e (eq) = 

w t j 

w t p-\ - 

Po 

3 • 3, 



(A.3b) 

Q { : q) = 

2 . 

^ J X ? 

C = 

qi eql = 

2 . 

(A. 3c) 

r/eq) _ 
rXX 

^ - 

Ul+fJ ] > 

„(eq) 

rww 

1 

Po 

[£-£]. 

(A. 3d) 

r/eq) _ 
rxy 

1 . . 

JxJyi 

Po 

Pi) 

Z 1 

nleq) 

1 J XZ 

1 . . 

— JxJzi 

Po 

(A.3e) 

^(eq) _ 

n XX 

w xx p { x f\ 

7j-( ec l) — ■)/; 

n WW w XXVwW > 


(A.3f) 

mi eq) = 

-- = m< eq) = 0, 




(A.3g) 


where w e and w e j are again free parameters and w xx is an additional free parameter in the D3Q19 model. 

The bulk viscosity ( of the D3Q19 model is equal to that of the D3Q15 model given in equation (3.6b) 
and its kinematic viscosity v is 


1 

(~- 


1 



3 

\s& 

V 

- 3 

Ul3 

V 


(A.4) 


To recover the corresponding LBGK model, one must set w e = 3, w e j = —11/2, and w xx = —1/2. 
However, to attain an optimized stability of the model, we obtained the following parameter values through 
linear analysis (Lallemand & Luo 2000): w e = 0, w e j = —475/63, w xx = 0, Si = 1.19, s -2 = S10 = 1-4, 
S4 = 1.2, and sie = 1.98. With the above parameter values, we can use a maximum speed of 0.19 (Mach 
number M « 0.33) and a viscosity v > 2.54- 10 -3 in simulations. The linear analysis to obtain these ‘optimal’ 
parameter values is a local analysis of a system with a uniform velocity of wave- vector k. The local analysis 
does not consider boundary conditions and therefore the system may be less stable in actual simulations. 
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The transformation matrix M is given by 



1 
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1 

1 
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1 
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1 

1 
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